1. Two friends, call them Shane and Brittany, play a game 20 times. Let X = the number of times that Brittany wins the game. Consider the following rule for deciding whether or not the game is fair:
Judge the game to be fair if 5 <= X <= 15.
Judge the game to be biased if either X <= 4 or X >= 16.
What is the probability of judging the game to be biased when it is actually fair? (10)
Assuming the game is fair, what is the probability that Brittany wins at most 12 times? (5)
Assuming the game is fair, what is the probability that Brittany wins more than 12 times? (5)
Assuming the game is fair, what is the probability that Brittany wins fewer than 12 times? (5)
Assuming the game is fair, what is the probability that Brittany wins exactly 8 times? (5)
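The quantities asked for in Problem 1 are binomial tail and point probabilities. The following is a minimal sketch in base R of how they could be computed, assuming that "fair" means Brittany wins each game independently with probability 0.5 and that no game ends in a tie (both are assumptions, not stated in the problem). The variable names are illustrative only.

```r
# Assumption: X ~ Binomial(n = 20, p = 0.5) when the game is fair.
n <- 20
p <- 0.5

# P(judge biased | fair) = P(X <= 4) + P(X >= 16)
# pbinom(15, ..., lower.tail = FALSE) gives P(X > 15) = P(X >= 16).
p_biased <- pbinom(4, n, p) + pbinom(15, n, p, lower.tail = FALSE)

# Cumulative and point probabilities for the remaining parts
p_at_most_12    <- pbinom(12, n, p)                      # P(X <= 12)
p_more_than_12  <- pbinom(12, n, p, lower.tail = FALSE)  # P(X > 12)
p_fewer_than_12 <- pbinom(11, n, p)                      # P(X < 12) = P(X <= 11)
p_exactly_8     <- dbinom(8, n, p)                       # P(X = 8)

print(c(biased = p_biased,
        at_most_12 = p_at_most_12,
        more_than_12 = p_more_than_12,
        fewer_than_12 = p_fewer_than_12,
        exactly_8 = p_exactly_8))
```

Because X takes only integer values, "more than 12" and "at most 12" are complements, so those two results should sum to 1, which is a quick check on the work.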
2. Data from blood samples are stored in the file P:\data\math\stats\BloodLevels.csv. The response variable of interest is the level of high-density lipoprotein (HDL) in a blood sample. This is a component of cholesterol often known as "good" cholesterol. In practice it is difficult (and expensive) to measure the HDL level directly from a blood sample. Three other quantities are easier (and cheaper) to determine. They are the total cholesterol in the sample (Chol), the total triglycerides in the sample (Triglyc), and the presence (1) or absence (0) of a sticky substance called sinking pre-beta (SPB). The data below are from the analysis of blood samples from 21 volunteers. Units for the levels are milligrams per deciliter of blood (mg/dl).
Compare the HDL levels when SPB is present and absent. Be sure to address center, shape, and spread of the two distributions. (15)
We are going to consider models for predicting the HDL levels, using any of the other variables in the data set as predictors. Create scatterplots of each of the potential predictors with HDL. (15)
Based on the information in the scatterplots, which of the potential predictors (Chol, Triglyc, SPB) is the weakest predictor (on its own) of the HDL response variable? (5)
Weakest predictor of HDL= _________
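One possible way to start Problem 2 in base R is sketched below. It assumes the CSV file has columns named HDL, Chol, Triglyc, and SPB; the name HDL in particular is an assumption, since the problem names only the other three. The helper explore_hdl is hypothetical and written only for illustration.

```r
# Hypothetical helper: summarise HDL by SPB group and plot HDL against
# each potential predictor.
# Assumes columns HDL, Chol, Triglyc, and SPB (HDL is an assumed name).
explore_hdl <- function(blood) {
  # Centre and spread of HDL for SPB absent (0) and present (1)
  by_spb <- sapply(split(blood$HDL, blood$SPB), function(x) {
    c(n = length(x), mean = mean(x), median = median(x),
      sd = sd(x), IQR = IQR(x))
  })

  # Shape: side-by-side boxplots, then one scatterplot per predictor
  old_par <- par(mfrow = c(2, 2))
  on.exit(par(old_par))
  boxplot(HDL ~ SPB, data = blood, main = "HDL by SPB")
  plot(HDL ~ Chol, data = blood, main = "HDL vs. Chol")
  plot(HDL ~ Triglyc, data = blood, main = "HDL vs. Triglyc")
  plot(HDL ~ SPB, data = blood, main = "HDL vs. SPB")

  by_spb
}

# Usage (path as given in the problem; forward slashes avoid escaping in R):
# blood <- read.csv("P:/data/math/stats/BloodLevels.csv")
# explore_hdl(blood)
```

The returned matrix has one column per SPB group, which makes the centre and spread comparison easy to read off; the boxplots and scatterplots still need to be interpreted in words.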
3. Install and load the package Stat2Data into your RStudio session. Use the commands data("Marathon") and str(Marathon) to create a data frame and examine its structure. Training times for a marathon runner are provided. The data, Miles run, Time (in minutes:seconds:hundredths), and running Pace (in minutes:seconds:hundredths), are given for a five-year period from 2002 to 2006. The Time and Pace have been converted to decimal minutes in TimeMin and PaceMin, respectively. The Brand of the running shoe is added for 2005 and 2006. Use graphical and numerical methods from Chapter 1 to investigate whether the runner has a tendency to go faster on short runs (5 miles or fewer) than on long runs. The variable Short in the data set is coded with 1 for short runs and 0 for longer runs. Assume that the data for this runner can be viewed as a sample for runners of a similar age and ability level. (40)
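A graphical and numerical comparison for Problem 3 might begin as sketched below. The helper compare_pace is hypothetical; it relies only on the PaceMin and Short columns described in the problem. Whether a formal test belongs among the intended methods is an assumption, so the t-test appears only as a commented-out option.

```r
# Hypothetical helper: compare pace between short (Short == 1) and
# longer (Short == 0) runs. A smaller PaceMin means a faster run.
compare_pace <- function(runs) {
  boxplot(PaceMin ~ Short, data = runs,
          xlab = "Short (0 = longer run, 1 = short run)",
          ylab = "PaceMin (decimal minutes)")

  # Numerical summaries of pace for each group
  sapply(split(runs$PaceMin, runs$Short), function(x) {
    c(n = length(x), mean = mean(x), median = median(x), sd = sd(x))
  })
}

# Usage, assuming Stat2Data is installed:
# library(Stat2Data)
# data("Marathon")
# compare_pace(Marathon)
# t.test(PaceMin ~ Short, data = Marathon)  # one possible formal comparison
```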
4. Install and load the package Stat2Data into your RStudio session. Use the commands data("Caterpillars") and str(Caterpillars) to create a data frame and examine its structure. Student and faculty researchers at Kenyon College conducted numerous experiments with Manduca sexta caterpillars to study biological growth. A subset of the measurements from some of the experiments is in Caterpillars. The variables in the dataset include:
- Instar - a number from 1 (smallest) to 5 (largest) indicating stage of the caterpillar's life
- ActiveFeeding - an indicator (Y or N) of whether or not the animal is actively feeding
- Fgp - an indicator (Y or N) of whether or not the animal is in a free growth period
- Mgp - an indicator (Y or N) of whether or not the animal is in a maximum growth period
- Mass - body mass of the animal in grams
- Intake - food intake in grams/day
- WetFrass - amount of frass (solid waste) produced by the animal in grams/day
- DryFrass - amount of frass, after drying, produced by the animal in grams/day
- Cassim - CO2 assimilation (ingestion-excretion)
- Nfrass - Nitrogen in frass
- Nassim - Nitrogen assimilation (ingestion-excretion)
Produce a scatterplot for predicting WetFrass based on Mass. Comment on any patterns. (10)
Produce a similar plot using the log (base 10) transformed variables, LogWetFrass versus LogMass. Again, comment on any patterns. (10)
Which plot (or set of variables) would you prefer to predict the amount of wet frass produced for caterpillars? Explain. (10)
Create side-by-side boxplots to compare the distributions of Mass for all five instars. Describe any patterns you see. (15)
Create side-by-side boxplots to compare the distributions of Intake for all five instars. Describe any patterns you see. (15)
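The plots requested in Problem 4 could be produced along the lines of the sketch below. The helper plot_caterpillars is hypothetical. The data set may already contain LogMass and LogWetFrass; recomputing them here is an assumption made only to show the base-10 transformation explicitly.

```r
# Hypothetical helper: add base-10 log columns and draw the four plots.
# Assumes columns Mass, WetFrass, Intake, and Instar as listed above.
plot_caterpillars <- function(cats) {
  # Base-10 log transforms (overwrites any existing columns of the same name)
  cats$LogMass <- log10(cats$Mass)
  cats$LogWetFrass <- log10(cats$WetFrass)

  old_par <- par(mfrow = c(2, 2))
  on.exit(par(old_par))
  plot(WetFrass ~ Mass, data = cats, main = "Original scale")
  plot(LogWetFrass ~ LogMass, data = cats, main = "Log10 scale")
  boxplot(Mass ~ Instar, data = cats, main = "Mass by instar")
  boxplot(Intake ~ Instar, data = cats, main = "Intake by instar")

  # Return the augmented data frame without printing it
  invisible(cats)
}

# Usage, assuming Stat2Data is installed:
# library(Stat2Data)
# data("Caterpillars")
# plot_caterpillars(Caterpillars)
```

Placing the original-scale and log-scale scatterplots side by side makes it easier to judge which pair of variables shows the more linear pattern with more even spread.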